Spark & Python Notebooks V: Decision Trees & Model Selection
The fourth episode in our Spark series introduced Logistic Regression with MLlib. This new notebook explains how to use the library to build a Decision Tree classifier on a large dataset. It also shows how useful trees are for understanding our data and even for performing model selection.
Spark & Python Notebooks IV: Logistic Regression & Model Selection
The third episode in our Spark series introduced MLlib and its Statistics and Exploratory Data Analysis capabilities. This fourth notebook explains how to use the library to build a classifier using Logistic Regression on a large dataset. It also describes two different approaches to model selection.